Term paper: Randomized Algorithms and Heuristics for Join Ordering
Author
Abstract
Large queries containing many joins are becoming increasingly common in relational databases. The order in which the joins are executed matters a great deal: a poor ordering can severely degrade the efficiency of the DBMS. Scheufele and Moerkotte proved that join ordering is NP-complete in the general case [4]. For smaller queries, with fewer than approximately 10 joins, the optimal join strategy can be found by dynamic programming. However, the dynamic programming algorithm proposed in [5] has a worst-case running time of O(2^N), where N is the number of joins, so it becomes infeasible for queries with more than about 10 joins. The literature offers many alternative approaches to the join ordering problem; Steinbrunn et al. [6] present a good overview. Approaches such as Iterative Improvement, Simulated Annealing, Genetic Algorithms, Two-Phase Optimization, and the probabilistic QuickPick all provide efficient alternatives, at the price of producing suboptimal solutions. The next section presents each of these in turn.
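To make the exponential cost of the dynamic programming approach concrete, the following is a minimal Python sketch of subset-based dynamic programming for join ordering. It is an illustration only, not the algorithm of [5]: it assumes left-deep plans, allows cross products, treats predicate selectivities as independent, and uses the sum of intermediate result sizes as its cost. The function name `best_left_deep_order` and its parameters are hypothetical. The table `best` holds one entry per non-empty subset of relations, which is where the 2^N growth comes from.

```python
def best_left_deep_order(cards, sel):
    """Return (cost, order) for the cheapest left-deep join order.

    cards: list of base-relation cardinalities (assumed known).
    sel:   dict mapping frozenset({i, j}) to the selectivity of the join
           predicate between relations i and j; missing pairs count as 1.0
           (a cross product). Both inputs are illustrative assumptions.
    """
    n = len(cards)
    size = {}  # bitmask of relations -> estimated result cardinality
    best = {}  # bitmask of relations -> (cost, join order as a tuple)

    # Base case: a single relation costs nothing to "join".
    for i in range(n):
        size[1 << i] = float(cards[i])
        best[1 << i] = (0.0, (i,))

    # Every proper subset of a mask is numerically smaller than the mask,
    # so iterating in increasing order visits subproblems first.
    for mask in range(1, 1 << n):
        if mask in best:
            continue
        members = [i for i in range(n) if (mask >> i) & 1]

        # Estimated size of the join result; independent of the join order.
        last = members[-1]
        result = size[mask & ~(1 << last)] * cards[last]
        for j in members[:-1]:
            result *= sel.get(frozenset((last, j)), 1.0)
        size[mask] = result

        # Try each relation as the one joined last and keep the cheapest.
        candidates = []
        for i in members:
            sub_cost, sub_order = best[mask & ~(1 << i)]
            candidates.append((sub_cost + result, sub_order + (i,)))
        best[mask] = min(candidates)

    return best[(1 << n) - 1]
```

Calling `best_left_deep_order([10, 100, 1000], {frozenset((0, 1)): 0.1, frozenset((1, 2)): 0.01})` returns the cheapest cost together with the corresponding order of relation indices. Because every subset is visited once, the table roughly doubles with each additional relation, which is the motivation for the randomized and heuristic alternatives.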
Similar resources
An experimental study on the complexity
Not only in deductive databases, logic programming, and constraint satisfaction problems but also in object bases where each single dot in a path expression corresponds to a join, the optimizer is faced with the problem of ordering large numbers of joins. This might explain the renewed interest in the join ordering problem. Although many join ordering techniques have been invented and benchmark...
Join Ordering for Constraint Handling Rules: Putting Theory into Practice
Join ordering is the NP-complete problem of finding the optimal order in which the different conjuncts of multi-headed rules are joined. Join orders are the single most important determinants for the runtime complexity of CHR programs. Nevertheless, all current systems use ad-hoc join ordering heuristics, often using greedy, very error-prone algorithms. As a first step, Leslie De Koninck and Jo...
Repeated Record Ordering for Constrained Size Clustering
One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into current clustering algorithms. One of the well researched constrained clustering algorithms is called microaggregation. In a microaggreg...
Processing Sliding Window Multi-Joins in Continuous Queries over Data Streams
We study sliding window multi-join processing in continuous queries over data streams. Several algorithms are reported for performing continuous, incremental joins, under the assumption that all the sliding windows fit in main memory. The algorithms include multiway incremental nested loop joins (NLJs) and multi-way incremental hash joins. We also propose join ordering heuristics to minimize th...
Automata Theory based Approach to the Join Ordering Problem in Relational Database Systems
The join query optimization problem has been widely addressed in relational database management systems (RDBMS). The problem consists of finding a join order that minimizes the time required to execute a query. Many strategies have been implemented to solve this problem including deterministic algorithms, randomized algorithms, meta-heuristic algorithms and hybrid approaches. Such methodologies...